On The Structure of Style Space for Documents

Authors

  • Rhys Price Jones
  • J. Fernando Naveda
  • Paul Roetling
  • Steven J. Harrington
  • Nishant Thakkar
Abstract

We identify three aspects of style pertaining to documents. The first, which we call literary style, includes the word and sentence constructions and the choice of illustrations traditionally associated with authorship. The second, which we call informative style, includes formatting and iconic choices that convey additional information such as the document’s genre or corporate identity. The third, intent style, covers the degrees of freedom remaining for the author and is used to convey the author’s intent. Literary style is the realm of academic scholarship and discourse and is beyond the scope of the present article. Informative and intent style, however, can be quantified by measuring many different attributes. For example, density of text, colorfulness of images, regularity of image positioning, and diversity of font and typeface all contribute to the document’s overall style. Indeed, we have identified more than 150 different value functions, each of which can be measured and each of which can contribute to a document’s overall stylistic appearance. Measuring these value functions effectively places a document as a point in a style space. The 150 value functions are not independent, however. We describe a heuristic approach for investigating the possibility of finding basis vectors for intent space.
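To make the idea of a style space concrete, the following is a minimal sketch, not the authors' method. It assumes three hypothetical value functions (`text_density`, `white_space`, `font_diversity`) and made-up sample documents, places each document as a point with one coordinate per value function, and then uses a simple correlation threshold as a stand-in heuristic for spotting value functions that are not independent. The abstract does not say which heuristic the authors use, so the greedy pruning and the 0.9 threshold here are assumptions for illustration only.

```python
from math import sqrt

# Hypothetical value functions; each maps a document record to a score in [0, 1].
VALUE_FUNCTIONS = {
    "text_density": lambda d: d["text_area"] / d["page_area"],
    "white_space": lambda d: 1.0 - d["text_area"] / d["page_area"],
    "font_diversity": lambda d: min(d["fonts"] / 10.0, 1.0),
}


def style_point(doc):
    """Place one document in style space: one coordinate per value function."""
    return [f(doc) for f in VALUE_FUNCTIONS.values()]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def near_independent(points, names, threshold=0.9):
    """Greedily keep each axis unless it correlates strongly with one already kept."""
    columns = list(zip(*points))
    kept = []
    for i in range(len(names)):
        if all(abs(pearson(columns[i], columns[j])) < threshold for j in kept):
            kept.append(i)
    return [names[i] for i in kept]


# Made-up sample documents, for illustration only.
docs = [
    {"text_area": 60.0, "page_area": 100.0, "fonts": 2},
    {"text_area": 30.0, "page_area": 100.0, "fonts": 7},
    {"text_area": 80.0, "page_area": 100.0, "fonts": 6},
    {"text_area": 45.0, "page_area": 100.0, "fonts": 3},
]
points = [style_point(d) for d in docs]
kept = near_independent(points, list(VALUE_FUNCTIONS))
print(kept)
```

In this toy data `white_space` is fully determined by `text_density`, so the pruning step drops it while retaining `font_diversity`. A real investigation of basis vectors over some 150 value functions would likely need a more careful technique than pairwise correlation.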



Publication date: 2004